Even in the best of cases, where much is known about a
future solution’s peak transaction workload or a typical end user’s work
habits, SAP solution sizing remains an iterative process, as much
art as science. Understanding SAP’s architecture therefore pays big
dividends when it comes to the sizing process. Of course, to gain more
than a cursory understanding of the internal architecture employed by
mySAP components requires considerable training, coupled with a number
of years of experience. But a basic understanding of the core concepts
behind the operation of an SAP system will help smooth out some of the
bumps during the sizing process. And this knowledge should help you
maintain an apples-to-apples sizing comparison between different
hardware and other SAP technology partners.
Each mySAP component can be architected to take advantage of something called a three-tiered client/server architecture.
Many years ago, SAP realized the advantages of separating the
application’s logic from the database. The three technology layers that
came of this—database, application, and front-end client—came to be
known as a three-tiered architecture. By breaking the layers apart this
way, each could be scaled,
or grown, independently, which at the time was a very different approach
from the monolithic mainframe solutions of the day, where growth meant
tossing out your current mainframe and lugging in a bigger one.
SAP also architected
the three layers such that they could reside on a single physical
machine, or could be combined in different ways. The result was a very
flexible and—based on the number of SAP deployments—a very successful
architecture. Today, every layer, even the database server, which
handles all database transactions, can be scaled through products like
Oracle’s 9i Real Application Clusters.
If we think of the database as the first layer in the three-tiered client/server architecture, the application component, called the Application Server, is the second layer. It is very common to see anywhere from 2 or 3 to perhaps 10 or 12 application servers in
a single system. In this way, the system’s processing power can be
increased easily as utilization climbs or as the system must host a
greater number of users.
The runtime element of mySAP components is referred to as the kernel, which spawns a number of SAP work processes,
each serving different functions—work processes are created
specifically to support online users, background or batch processes,
printing, database updates, and so on.
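As a rough illustration of the work-process mix just described, the sketch below tallies the processes configured for a single application server instance. The counts are invented for illustration; the dictionary keys are the SAP instance-profile parameters that control each work-process type (dialog, batch, update, spool, and enqueue).

```python
# Illustrative work-process mix for one application server instance,
# keyed by the instance-profile parameters that set each type's count.
# The counts themselves are invented, not a recommendation.
work_processes = {
    "rdisp/wp_no_dia": 10,  # dialog: online user requests
    "rdisp/wp_no_btc": 3,   # background/batch jobs
    "rdisp/wp_no_upd": 2,   # database updates
    "rdisp/wp_no_spo": 1,   # spool (printing)
    "rdisp/wp_no_enq": 1,   # enqueue (lock management, on the CI)
}

total = sum(work_processes.values())
dialog_share = work_processes["rdisp/wp_no_dia"] / total
print(f"{total} work processes configured, {dialog_share:.0%} dialog")
```

A heavily online-user-oriented system skews this mix toward dialog processes, while a batch-heavy system shifts it toward background processes; the sizing exercise should reflect which profile you expect.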
Another unique element of the application layer is the Central Instance,
which takes care of handling database locks, interapplication server
messaging, and other core housekeeping activities; without the Central
Instance, there is no SAP installation. Oftentimes, the Central Instance
(CI) actually runs with one of the Application Servers dedicated to
servicing end-user transactions. For the most robust configurations,
though, SAP allows the CI to be relocated to its own physical server, so
that the processor-intensive application servers can be granted the
resources of an entire server instead of being forced to share CPU and
memory with the CI. In other words, a separate CI server helps ensure
the system can respond instantly to the next request without waiting for
resources (primarily CPU) to be freed from some non-CI-related use.
The third layer in the
three-tiered client/server architecture is the presentation layer, which
simply means front-end clients. It is with these front-end clients
(desktops, laptops, wireless handheld devices, and so on), that users
connect to the SAP system using a Web browser or SAP’s own user
interface, the SAPGUI. In many cases nowadays, an additional tier, the
Internet or a company’s intranet, exists, too. This tier actually
resides between the application and client tiers, and in effect extends
both the application logic and the network of mySAP solutions. Thus, a
four-tier solution is born in these cases.
Although my focus thus far
has been on three-tier environments, remember that SAP’s architecture is
flexible and can easily be adapted to support two tiers as well. Simply
put, if the database server and application server execute on the same
physical server, you have a two-tier system. In this kind of
environment, end users connect directly to the central instance, whereas
in a three-tier environment, end users connect to a specific
application server or pre-established group of servers called a logon group
(though this connection is still intelligently handled by the CI). And
whether you have architected a two-, three-, or four-tiered SAP system,
all communication between the different tiers takes place over TCP/IP
(with some exceptions in two-tier systems that leverage
process-to-process communications, which are outside the scope of this
introduction to SAP architecture).
Now, armed
with a better understanding of what SAP architecture entails, let’s move
into the next section where different sizing methodologies are put into
practice to architect specific mySAP solutions.
Understanding Different Sizing Methodologies
In
addition to a full-fledged top-to-bottom solution stack approach to SAP
sizing, a number of other sizing methodologies and approaches are often
undertaken by different SAP technology partners. The key to any valid
sizing approach is to understand the workload being performed, so that a
hardware configuration with the proper number and speed of CPUs, RAM,
and disk drives can be assembled. Some sizing approaches are faster than
others, though, at the expense of sizing precision. For example, many
hardware vendors provide Budgetary Sizings
based solely on the expected number of active users to be supported by a
particular mySAP solution. In this way, a ballpark dollar figure can be
gleaned early in an SAP project without requiring all the time and
trouble of answering a comprehensive sizing questionnaire.
As I mentioned
earlier, SAP AG provides its own rendition of a budgetary sizing by
means of its Quick Sizer, an online tool most often leveraged for its
ability to perform rapid user-based mySAP sizings. Available at http://service.sap.com/quicksizing,
the Quick Sizer can also model mySAP solutions even more accurately
through an analysis of transactions and resulting outputs, in the form
of customer-provided quantity and structure-related data. For example,
business requirements that can be described in terms of the number of
expected financial documents, receipts, postings, average line items in a
typical order, and so on to be processed or created annually will more
accurately help an SAP sizing expert craft a hardware solution than will
an online user-based sizing approach.
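To make the user-based approach concrete: SAP expresses hardware capacity in SAPS, a hardware-independent throughput unit defined by the SD standard application benchmark (100 SAPS corresponds to 2,000 fully business-processed order line items per hour). The sketch below converts an assumed user population into a rough SAPS target; the per-user weights and user counts are illustrative assumptions, not official Quick Sizer figures.

```python
# Rough user-based sizing sketch. The SAPS-per-user weights and the
# user counts are assumptions for illustration only.
saps_per_user = {"low": 0.3, "medium": 1.2, "high": 4.0}  # assumed weights
users = {"low": 300, "medium": 150, "high": 50}           # assumed counts

required_saps = sum(saps_per_user[c] * users[c] for c in users)
# Size above the steady-state load so the servers are not driven to
# full utilization (65% average is a common target).
sized_saps = required_saps / 0.65
print(f"steady-state load ~{required_saps:.0f} SAPS, "
      f"size for ~{sized_saps:.0f} SAPS at a 65% utilization target")
```

The point of the sketch is the shape of the calculation, not the numbers: a budgetary sizing is only as good as the activity assumptions behind each user class.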
This brings us to Transaction-Based Sizing.
As the name implies, this approach seeks to characterize and understand
the nature of end-to-end functional transactions being executed as part
of a particular mySAP component. In addition to the quantities and
structures already mentioned, peak processing hours and peak throughput
loads are also factored in, just as they should be in user-based sizing
exercises. Great care needs to be taken to avoid underestimating the
number of transactions to be performed by a particular system,
though—it’s easy to shortchange the sizing exercise. In my own
experience, I therefore try to do the following:
- Understand
what the new SAP system replaces, which can help me understand
potentially how many users in various functional areas might be using
SAP in the future. Again, though, great care must be taken not to
confuse the limited capabilities of a legacy system with a new SAP
solution. The SAP solution will generate many more transactions per
user, due to its greater capabilities and ties back into other
functional areas.
- Define the peak transaction processing
requirements, not just what the system will typically be doing
day-to-day. In other words, it’s important to discover what a particular
customer’s month-end or quarter-end close looks like from a transaction
load perspective, and whether any seasonal peaks exceed even this load.
Don’t forget to include both online and batch transactions.
- Explicitly
state assumptions. If a customer does not understand his batch job
requirements, or is unclear as to reporting requirements, I will take a
cut at this based on my own experience, and document my assumptions. In
this way, if the customer later learns what his exact requirements are,
it is a simple matter to refine the sizing document (and therefore avoid
accidentally doubling or tripling a load that had been previously
extrapolated but not clearly identified).
- Determine
transaction types and weighting. Not all transactions are equally
“heavy” in the eyes of SAP. Financial transactions, for example, may
only consist of four dialog steps whereas SD transactions are five to
six times heavier. Thus, not only do I determine the types of
transactions that will occur on a system, but I also seek to convert or
normalize all transactions into a similar genre (like SD, if I’m working
with R/3), by weighting many light transactions into heavier SD
transactions, and so on.
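The normalization step in the last bullet can be sketched as follows. The relative weights are invented for illustration (with SD fixed at 1.0), though the FI weight of 0.2 mirrors the five-to-one ratio mentioned above; the transaction volumes are hypothetical.

```python
# Convert a mixed transaction forecast into SD-equivalent units so the
# whole workload can be expressed in one "genre," as described above.
# Weights are illustrative assumptions; SD is 1.0 by definition here.
sd_weight = {"SD": 1.0, "FI": 0.2, "MM": 0.5}            # assumed relative cost
peak_hour_transactions = {"SD": 1000, "FI": 4000, "MM": 600}  # assumed volumes

sd_equivalents = sum(sd_weight[t] * n
                     for t, n in peak_hour_transactions.items())
print(f"peak load ~{sd_equivalents:.0f} SD-equivalent transactions/hour")
```

Once everything is stated in SD equivalents, the peak-hour figure can be compared directly against benchmark-derived capacity numbers for candidate hardware.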
Other sizing approaches are quite common, too. A Delta Sizing
approach, for instance, is quite useful for customers already live on a
particular SAP product. The customer’s SAP Basis or adept SAP
Operations team can easily be directed through SAP’s Computing Center
Management System (CCMS) to identify the load observed in real time and
historically, in terms of the number of dialog steps processed, so that
planned changes to the system (like adding an incremental number of
users or transactions) can be intelligently extrapolated.
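A delta sizing ultimately comes down to scaling an observed load by the planned change. The sketch below extrapolates a measured peak dialog-step rate to a larger user population; all figures are hypothetical.

```python
# Delta-sizing sketch: extrapolate from the observed peak load to a
# planned future load. All numbers are hypothetical.
current_dialog_steps_per_hour = 120_000  # observed peak from monitoring
current_users = 400
planned_users = 520                      # adding 120 users

scale = planned_users / current_users
projected_steps = current_dialog_steps_per_hour * scale
print(f"projected peak ~{projected_steps:,.0f} dialog steps/hour "
      f"({scale:.0%} of today's load)")
```

Linear extrapolation is the simplest assumption; if the incremental users belong to a heavier functional area than today’s mix, the scale factor should be weighted accordingly.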
The final and
most demanding sizing process that I am aware of is called something
akin to “customer-specific sizing benchmarks” or “customer performance
testing” or “proof-of-concept tests.” Regardless of the label, these
sizing exercises take much of the assumption-making and guesswork out of
sizing, replacing them with hard facts as to the load that a particular
SAP Solution Stack is capable of bearing.
Although a Proof-of-Concept, or POC,
can be time-consuming (not to mention expensive), the resulting peace
of mind is compelling. POCs share much in common with stress tests and
load tests, which are executed prior to Go-Live to ensure that a
production system is indeed capable of meeting performance metrics and
other service-level agreements made between the IT department and an
enterprise solution’s customers, its end users. For example:
- A POC and a stress test are both focused on testing the performance and scalability of an SAP product.
- In both cases, testing usually begins with single-user tests, then scales to a larger and larger number of front-end clients until the load represents what the customer expects in the real world.
- To perform a POC or stress test, either the actual pre-production system or a system configured identically to it must be installed, configured, and tuned.
- Real-world data, and plenty of it, is required. In the early stages of sizing, this is often the biggest factor in pulling off a successful POC, as good data is hard to come by, much less lots of it.
- Access to onsite sizing and POC professionals can be a challenge, depending on the solution stack you wish to test.
- The overall expense can seem prohibitive, but as with an insurance policy, you will likely prefer to spend a little today to save a lot down the road.
Because of these factors,
in the end I’ve seen more user-based SAP solution sizing than anything
else. Of course, transaction-based sizing and, to a lesser extent,
proof-of-concept testing will always be popular when risk is
highest. This is especially true for large or otherwise complex mySAP
architectures, where transaction-based sizing should represent a minimum
requirement of sorts, so as to both accurately and conservatively size
your SAP project. And other sizing approaches exist, too, that target
specific unknowns. For example, “characterization testing” seeks to test
a specific function like batch processing or reporting, to learn how
much horsepower needs to be available to meet the required minimum
window of time to complete the batch processing or reporting. In those
cases where SAP allows a process to be broken down into parallel
processes and executed concurrently (called parallelization), such characterization is particularly important.
Sizing Tools, Practices, and Assumptions
SAP’s Quick
Sizer, along with all hardware vendors’ SAP sizing tools, must make
assumptions regarding what you seek in a solution. One of the most
important assumptions that you need to verify with your hardware vendor
involves how your specific SAP workload is distributed among the servers
in your solution. Each vendor’s SAP sizing tools make
assumptions like these:
- The load borne by a system architected for three tiers is often split 33/67 or 25/75 between the database server and the application tier (the Central Instance combined with all application servers), respectively. Verify that these numbers are consistent if you are working with multiple hardware vendors.
- Batch and online user loads may be distributed to dedicated servers, or shared. Verify how the tool addresses this, if at all.
- When clustering, each node in a cluster can be configured to perform work while running in “normal” non-failover mode. Verify what both the normal and failover workloads look like for each cluster node.
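The load-split assumption in the first bullet amounts to simple apportionment. The sketch below divides a total capacity target between the database server and the application tier; the total and the 33% database share are illustrative placeholders.

```python
# Sketch of the 33/67 load-split assumption: apportion a total capacity
# target (in SAPS) between the database server and the application tier
# (CI plus application servers). All figures are illustrative.
total_saps = 3000
db_share = 0.33          # the other common assumption would be 0.25

db_saps = total_saps * db_share
app_saps = total_saps - db_saps
print(f"database server ~{db_saps:.0f} SAPS, "
      f"application tier ~{app_saps:.0f} SAPS")
```

If one vendor sizes against a 25/75 split and another against 33/67, the proposed database servers will differ markedly even for an identical workload, which is exactly why the split must be stated explicitly.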
A sizing tool must also
make assumptions as to the specific version of a database release,
operating system version, and even mySAP release! My advice is to verify
with your hardware or software vendor that any specific SAP Solution
Stack components you require are indeed addressed by their toolsets and
approaches. It makes little sense, for example, for a hardware vendor to
use its SAP/UNIX sizer for a specifically requested Windows solution.
The same goes for specific versions of databases and mySAP
components—each version has different processing, memory, and often even
disk requirements. Using a tool incapable of addressing your specific
solution stack makes the output derived from that tool suspect.
Similarly,
attention needs to be given to the methodology employed to determine how
large your SAP database will be in two or three years, and what the
growth chart will look like over time. Different database sizing
approaches have evolved over the years; verify that your prospective
vendors are using the same method, or in some other way guide them
toward agreeing on a number that makes sense to everyone. I like to size
an SAP database for three years’ growth, for example.
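A three-year database projection is often modeled as compound growth. The sketch below applies an assumed annual growth rate to a hypothetical starting size; both figures are placeholders, and your vendors may use a different growth model entirely, which is precisely what you should verify.

```python
# Simple three-year database growth projection under an assumed
# compound annual growth rate. Both inputs are hypothetical.
initial_db_gb = 200
annual_growth = 0.40     # assumed 40% growth per year

sizes = [initial_db_gb * (1 + annual_growth) ** year for year in range(4)]
for year, size in enumerate(sizes):
    print(f"year {year}: ~{size:.0f} GB")
```

Note how quickly compounding outpaces a linear estimate: the same inputs treated linearly (adding 80 GB per year) would predict 440 GB at year three rather than roughly 549 GB, a gap large enough to change the proposed disk subsystem.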
Another
important assumption has to do with system utilization numbers. When a benchmark is run,
the system being tested is usually stressed up to the point where the
average response time observed by each user is just under two seconds.
Doing so typically pushes CPU utilization to a maximum of 99 or 100%. In
the real world, though, when sizing SAP solutions, hardware vendors
need to make assumptions as to what kind of utilization thresholds you
are comfortable with. These vary depending upon the vendor, but
typically resemble the following de facto standards:
- Servers are sized such that the average CPU utilization over time is 65% or so. In other words, the system might spike to 100% or sit nearly idle occasionally, but generally it will hover around the 65% mark.
- Of this 65% utilization, fully half is dedicated to user-based dialog processing, and the other half to a combination of batch processing, printing, interface processing, and reporting.
- The remaining 35% of capacity stays available so that new work can be initiated with minimal delay, which in turn yields predictably good response times. This extra capacity also helps absorb unforeseen or unplanned future workloads.
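The utilization target translates into a simple capacity calculation: divide the expected steady-state load by the target utilization to find the capacity you should actually buy. The workload figure below is hypothetical.

```python
# Back out the capacity needed so a given workload lands at a
# 65%-average-utilization target. The workload figure is hypothetical.
workload_saps = 1300          # expected steady-state demand
target_utilization = 0.65

required_capacity = workload_saps / target_utilization
headroom = required_capacity - workload_saps
print(f"size for ~{required_capacity:.0f} SAPS "
      f"(~{headroom:.0f} SAPS of headroom)")
```

A vendor that silently sizes against 100% utilization instead would quote roughly a third less capacity for the same workload, which is why this assumption deserves explicit scrutiny.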
Be careful that each of
these assumptions is clearly documented in the sizing documents you
receive from each hardware vendor. Differences in assumptions can make
an enormous difference in the solution proposed by one vendor over
another, for instance. I know of one hardware vendor in particular that
in the past deviated from these standards for the express purpose of
making its hardware solutions seem more robust than the competition’s:
by sizing for the full 100% capacity of a server, its solutions appeared
to require less RAM and CPU processing power. This helped it undercut
other hardware vendors’ proposals, when in fact these
less-than-customer-focused tactics only left its clients with premature
performance problems that eventually had to be addressed before Go-Live.
Best Practices Regarding System Landscape Design
To ensure
apples-to-apples sizings, I recommend that you plainly direct each
potential hardware vendor to size for identical system landscapes. In
other words, do not leave this up to their discretion (unless your goal
is to simply see at a high level what kind of unique solution each
vendor can craft to solve your business problem). For example, you may
want to be explicit about how each vendor should address high
availability. It is better to indicate “include clusters for HA and SQL
Server log shipping for DR” rather than only stating a 99.9% uptime
requirement and allowing each vendor to determine how to address this
themselves.
And be clear as to which
SAP system landscape components you want to see included in your sizing.
A four-system landscape can be interpreted in many different ways: one
vendor might make the fourth system a training system, another a
technical sandbox, and a third a staging system. The
same approach is true for database sizing—clearly indicate where you
wish to host copies of your full production size database, and where
smaller development or sandbox databases are appropriate.
With regard to the system
landscape, you also must be clear about whether a fourth tier is
required, and what exactly that entails. And you need to cover landscape
deployment options, like using instance stacking
to install multiple systems on one physical server. Stacking is quite
common in the Unix world of SAP (and to a much lesser extent, Windows),
where development, test, and training instances might all reside on one
very capable server rather than separate servers. Finally, you should
help push each vendor toward a consistent standard for sizing the
various systems within the system landscape. Specify, for example, where
minimal server, disk subsystem, and other hardware components should be
employed. Your Test system should be able to support a specific number
of users, or a specific percentage of the load to be eventually borne by
production. Similarly, your development system should be configured
robustly enough to keep your development team from walking off the
job—give your hardware vendors a specific target, like the ability to
support 20 high-activity developers, and this will help you to continue
to support an apples-to-apples sizing comparison.